DFA Learning of Opponent Strategies
Authors
Abstract
This work studies the control of robots in the adversarial world of "Hunt the Wumpus". The hybrid learning algorithm that controls the robots' behavior combines a modified RPNI algorithm with a utility-update algorithm. The modified RPNI algorithm is a DFA-learning algorithm used to learn opponents' strategies. The utility-update algorithm uses information gleaned from the modified RPNI to quickly drive the agent's mission to a successful conclusion.
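RPNI, the core of the learner described above, is a classic state-merging DFA-induction algorithm. As a rough illustration only (plain RPNI, not the paper's modified variant), the sketch below builds a prefix-tree acceptor from positive example strings and then merges states in canonical order whenever the merge, after determinising folds, still rejects every negative example. All names (`build_pta`, `rpni`) and the sample language are illustrative assumptions, not taken from the paper.

```python
# Minimal RPNI sketch (illustrative; not the paper's modified variant).
from itertools import count

def build_pta(positive):
    """Prefix-tree acceptor: one path per positive string, root state 0."""
    trans, accepting, fresh = {}, set(), count(1)
    for word in positive:
        state = 0
        for sym in word:
            if (state, sym) not in trans:
                trans[(state, sym)] = next(fresh)
            state = trans[(state, sym)]
        accepting.add(state)
    return trans, accepting

def accepts(trans, accepting, word):
    """Run the (partial) DFA; a missing transition means rejection."""
    state = 0
    for sym in word:
        if (state, sym) not in trans:
            return False
        state = trans[(state, sym)]
    return state in accepting

def merged_dfa(trans, accepting, q1, q2, negatives):
    """Merge q2 into q1 and fold to restore determinism.
    Return the new (trans, accepting) if every negative sample is still
    rejected, otherwise None."""
    rep = {}
    def find(q):
        while q in rep:
            q = rep[q]
        return q
    stack = [(q1, q2)]
    while stack:
        a, b = stack.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        if b < a:
            a, b = b, a          # keep the lower-numbered state as root
        rep[b] = a
        succ = {}                # detect nondeterminism in the merged class
        for (s, sym), t in trans.items():
            if find(s) == a:
                u = find(t)
                if sym in succ and succ[sym] != u:
                    stack.append((succ[sym], u))  # fold: merge successors too
                else:
                    succ[sym] = u
    new_trans = {(find(s), sym): find(t) for (s, sym), t in trans.items()}
    new_acc = {find(q) for q in accepting}
    if any(accepts(new_trans, new_acc, w) for w in negatives):
        return None
    return new_trans, new_acc

def rpni(positive, negative):
    """Greedy RPNI: try to merge each PTA state into an earlier one."""
    trans, accepting = build_pta(positive)
    for j in sorted(set(trans.values())):
        live = {0} | {s for (s, _) in trans} | set(trans.values()) | accepting
        if j not in live:
            continue
        for i in sorted(live):
            if i >= j:
                break
            result = merged_dfa(trans, accepting, i, j, negative)
            if result is not None:
                trans, accepting = result
                break
    return trans, accepting

# Example: an opponent whose "accept" behaviour is strings over {a, b}
# with an even number of a's; RPNI compresses the 10-state prefix tree
# down to the 2-state target DFA.
pos = ["", "b", "aa", "bb", "aab", "aba", "baa"]
neg = ["a", "ab", "ba", "aaa", "bab"]
dfa_trans, dfa_acc = rpni(pos, neg)
```

On this sample the learned machine generalises beyond the training strings (e.g. it accepts "aaaa" and rejects "abb"), which is the behaviour an opponent-modelling agent would exploit to predict unseen moves.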
Similar resources
[Figure residue: panel titles "Avg. Model Size vs. No. of Changes" and "Model Size Error vs. No. of Changes"]
model of its opponent's strategy based on its past behavior, and uses the model to predict its future behavior. We represent an interaction between agents by a repeated game and restrict our attention to opponent strategies that can be represented by DFA. Learning a minimal DFA without a teacher was proved to be hard. We presented an unsupervised algorithm, US-L*, based on Angluin's L* algorith...
Using a Priori Information for Fast Learning Against Non-stationary Opponents
For an agent to be successful in interacting with many different and unknown types of opponents, it should excel at quickly learning a model of the opponent and adapting online to non-stationary (changing) strategies. Recent works have tackled this problem by continuously learning models of the opponent while checking for switches in the opponent's strategy. However, these approaches fail to use a pr...
Winning Opponent Counter Strategy Selection in Holdem Poker
The game of poker presents an interesting and complex problem for game theorists and researchers in machine learning. Current work on the subject focuses on how to develop optimal counter strategies, often using the Upper Confidence Bounds (UCB1) algorithm to determine which of these counter strategies is optimal for an unknown opponent. We present a new method for taking a learned set o...
Combining Opponent Modeling and Model-Based Reinforcement Learning in a Two-Player Competitive Game
When an opponent with a stationary and stochastic policy is encountered in a two-player competitive game, model-free Reinforcement Learning (RL) techniques such as Q-learning and Sarsa(λ) can be used to learn near-optimal counter strategies given enough time. When an agent has learned such counter strategies against multiple diverse opponents, it is not trivial to decide which one to use when a ...
Using iterated reasoning to predict opponent strategies
The field of multiagent decision making is extending its tools from classical game theory by embracing reinforcement learning, statistical analysis, and opponent modeling. For example, behavioral economists conclude from experimental results that people act according to levels of reasoning that form a “cognitive hierarchy” of strategies, rather than merely following the hyper-rational Nash equi...
Journal title:
Volume  Issue
Pages  -
Publication date: 1998